Ahmet Mahmut Gokkaya

Math - Linear Algebra¶

Linear algebra is a branch of mathematics that deals with the theory and application of linear equations, linear transformations, vector spaces, and matrices. It is a fundamental tool in many fields, including physics, engineering, economics, and computer science. Some of the main concepts and methods in linear algebra include:

Vectors: A vector is an element of a vector space, which is a collection of objects that can be added and scaled by numbers. Vectors can be represented in Cartesian coordinates and have operations like addition and scalar multiplication.

Matrices: A matrix is a rectangular array of numbers, variables, or expressions arranged in rows and columns. Matrices can be used to represent linear transformations and systems of linear equations.

Linear equations: A linear equation is an equation of the form ax + by = c, where a, b, and c are constants and x and y are variables. Linear equations can be represented in matrix form as Ax = b, where A is a matrix of coefficients, x is a vector of variables, and b is a vector of constants.

Determinants and inverses: The determinant of a square matrix is a scalar value that can be used to find the area of a parallelogram or volume of a parallelepiped. The inverse of a square matrix, if it exists, is a matrix that when multiplied by the original matrix, gives the identity matrix.

Eigenvalues and eigenvectors: An eigenvalue of a square matrix A is a scalar λ for which the equation Ax = λx has a non-zero solution x. Any non-zero vector x satisfying Ax = λx is called an eigenvector associated with the eigenvalue λ.

Linear transformations: A linear transformation is a function that takes a vector in one vector space to a vector in another vector space and preserves the operations of vector addition and scalar multiplication. Linear transformations can be represented by matrices.

Linear algebra is a powerful tool for solving a wide variety of mathematical problems and it has many practical applications in fields such as computer graphics, machine learning, and signal processing.
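To make a few of these concepts concrete, here is a small NumPy sketch computing a determinant, an inverse, and eigenvalues (the matrix M and its values are just an illustrative example; NumPy usage is introduced in detail later in this post):

```python
import numpy as np

# A small square matrix (illustrative example)
M = np.array([[4.0, 2.0],
              [1.0, 3.0]])

# Determinant: a scalar (4*3 - 2*1 = 10)
det = np.linalg.det(M)

# Inverse: multiplying M by its inverse gives the identity matrix
M_inv = np.linalg.inv(M)

# Eigenvalues and eigenvectors: solutions of M x = lambda x
eigenvalues, eigenvectors = np.linalg.eig(M)

print(det)                      # ~10.0 (up to floating-point error)
print(M.dot(M_inv))             # ~[[1, 0], [0, 1]]
print(np.sort(eigenvalues))     # [2. 5.]
```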

Vectors¶

Definition:

Vectors are mathematical objects that can be used to represent quantities with both magnitude and direction. They are typically represented by an arrow pointing in a certain direction, with the length of the arrow representing the magnitude of the vector.

Vectors can be represented in various ways, such as Cartesian coordinates, cylindrical coordinates and spherical coordinates. Vectors can be added and subtracted by combining their individual components and also can be multiplied by scalars, resulting in a vector that points in the same direction but has a different magnitude.

Some important properties and operations of vectors include:

Vector addition: The sum of two vectors is a vector that starts at the starting point of the first vector and ends at the endpoint of the second vector. This is also called the parallelogram law.

Scalar multiplication: Multiplying a vector by a scalar changes the magnitude of the vector while preserving its direction.

Dot product (scalar product): The dot product of two vectors is a scalar that can be used to find the angle between the vectors and the length of the projection of one vector onto the other.

Cross product (vector product): The cross product of two vectors is a vector that is perpendicular to both the input vectors and its magnitude is equal to the area of the parallelogram formed by the two input vectors.

Magnitude: The magnitude of a vector is a scalar that represents the length of the vector. It can be found by taking the square root of the dot product of a vector with itself.

Unit vector: A vector whose magnitude is 1 is called a unit vector.

Orthogonal vectors: Two vectors are orthogonal if the angle between them is 90 degrees.

Linear independence: Two vectors are said to be linearly independent if neither is a scalar multiple of the other, i.e. they do not lie on the same line.
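These operations can be checked numerically with NumPy (a small sketch; the vectors p and q are illustrative, and NumPy is introduced properly later in this post):

```python
import numpy as np

p = np.array([3.0, 0.0, 0.0])
q = np.array([0.0, 4.0, 0.0])

# Dot product is zero, so p and q are orthogonal (90 degrees apart)
print(np.dot(p, q))              # 0.0

# Cross product: perpendicular to both p and q; its magnitude (12)
# is the area of the parallelogram spanned by p and q
r = np.cross(p, q)
print(r)                         # [ 0.  0. 12.]

# Magnitude: square root of the dot product of a vector with itself
print(np.sqrt(np.dot(p, p)))     # 3.0
```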

Velocity=\begin{pmatrix}10 \\ 50 \\ 5000 \end{pmatrix}

Velocity is a vector quantity that describes the rate of change of an object's position in a given direction. It is typically represented by an arrow pointing in the direction of motion, with the length of the arrow representing the speed of the object. Velocity is measured in units of distance per unit of time, such as meters per second (m/s) or miles per hour (mph). It is the derivative of position with respect to time.

Purpose¶

A vector is a mathematical object used to represent a quantity that has both magnitude and direction. Vectors are commonly used in physics, engineering, and computer graphics to represent things like velocity, force, and position. They can be added, subtracted, and multiplied by scalars, and their properties can be analyzed using vector calculus.

Video = \begin{pmatrix} 11.5 \\ 7.2 \\ 2.15 \\ 8.0 \end{pmatrix}

The vector we have provided is a 4-dimensional vector. It is represented by 4 numbers in the form of a column matrix. The numbers in the vector can represent different quantities depending on the context in which the vector is used. For example, in a physics or engineering application, these numbers may represent the x, y, z, and t components of a position or velocity vector. In computer graphics, they may represent the red, green, blue, and alpha channels of a color.

11.5 minutes, 7.2 viewers, 2.15 per day average, 8 spams

A machine learning model might predict that there is a 70% probability the video is spam, 28% that it is clickbait, and 2% that it is a good video:

Class probabilities = \begin{pmatrix} 0.70 \\ 0.28 \\ 0.02 \end{pmatrix}

Vectors in python¶

In Python, vectors can be represented and manipulated using a variety of libraries, such as NumPy and SciPy.

The NumPy library is a powerful library for numerical computing in Python and provides a convenient way to represent and manipulate vectors.

In [1]:
[11.5, 7.2, 2.15, 8.0]
Out[1]:
[11.5, 7.2, 2.15, 8.0]

In NumPy, an ndarray stands for N-dimensional array, which is a powerful and efficient multi-dimensional array object that allows you to work with large arrays of homogeneous data (i.e., data of the same type, such as integers or floating point values). ndarray objects are used to store and manipulate large arrays of numerical data, and are the primary data structure used in NumPy.

In [2]:
import numpy as np
video = np.array([11.5, 7.2, 2.15, 8.0])
video
Out[2]:
array([11.5 ,  7.2 ,  2.15,  8.  ])

If we want to know the number of elements, we can use the size attribute:

In [3]:
video.size
Out[3]:
4

For the vector we created above, video = \begin{pmatrix} 11.5 \\ 7.2 \\ 2.15 \\ 8.0 \end{pmatrix}

the notation video[2] accesses the element at index 2. In Python, indexing starts from 0, so video[2] retrieves the 3rd element of the vector, which is 2.15.

In [4]:
video[2]  # 3rd element
Out[4]:
2.15

Plotting vectors¶

There are several ways to plot vectors in Python, depending on the library you are using.

One popular library for plotting vectors is Matplotlib. You can use the quiver() function from Matplotlib's pyplot module to plot vectors in 2D or 3D space. The quiver() function takes the x and y (or x, y, and z) components of the vectors as separate arguments, as well as other optional arguments to customize the plot.

Here's an example of how to plot a single 2D vector using Matplotlib:

In [5]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

2D Vectors¶

  1. First, let's create two simple 2D vectors:

In [6]:
u = np.array([2, 5])
v = np.array([3, 1])
  2. These two vectors, u and v, are easy to plot on a 2D Matplotlib graph. For example:

In [7]:
x_coords, y_coords = zip(u, v)
plt.scatter(x_coords, y_coords, color=["r","b"])
plt.axis([0, 9, 0, 6])
plt.grid()
plt.show()

Vectors are usually represented with arrows, so let's create a convenience function to draw nice arrows:

In [8]:
def plot_vector2d(vector2d, origin=[0, 0], **options):
    return plt.arrow(origin[0], origin[1], vector2d[0], vector2d[1],
              head_width=0.2, head_length=0.3, length_includes_head=True,
              **options)

We need to draw the vectors u and v as arrows:

In [9]:
plot_vector2d(u, color="r")
plot_vector2d(v, color="b")
plt.axis([0, 9, 0, 6])
plt.grid()
plt.show()

3D vectors¶

3D vectors are just as easy to plot. Let's create two 3D vectors:

In [10]:
a = np.array([2, 4, 6])
b = np.array([3, 7, 5])

Let's plot them using Matplotlib's Axes3D:

In [11]:
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt

subplot3d = plt.subplot(111, projection='3d')
x_coords, y_coords, z_coords = zip(a,b)
subplot3d.scatter(x_coords, y_coords, z_coords)
subplot3d.set_zlim3d([0, 9])
plt.show()

Let's create a small convenience function that plots a list of 3D vectors, drawing a vertical line under each one to mark its position on the horizontal plane:

In [12]:
def plot_vectors3d(ax, vectors3d, z0, **options):
    for v in vectors3d:
        x, y, z = v
        ax.plot([x,x], [y,y], [z0, z], color="gray", linestyle='dotted', marker=".")
    x_coords, y_coords, z_coords = zip(*vectors3d)
    ax.scatter(x_coords, y_coords, z_coords, **options)

subplot3d = plt.subplot(111, projection='3d')
subplot3d.set_zlim([0, 9])
plot_vectors3d(subplot3d, [a,b], 0, color=("r","b"))
plt.show()

NORM¶

The norm of a vector is its magnitude (or length). A vector with norm 1 is called a unit vector; dividing a non-zero vector by its norm (normalizing it) yields the unit vector that points in the same direction. Normalization is useful in many applications, as it allows consistent comparison of vectors with different magnitudes.

$$ \left \Vert \textbf{u} \right \| = \sqrt{\sum_{i}{\textbf{u}_i}^2} $$

We can implement this in Python, using the fact that $$ \sqrt x = x^{\frac{1}{2}}$$

In [13]:
def vector_norm(vector):
    squares = [element**2 for element in vector]
    return sum(squares)**0.5

print("||", u, "|| =")
vector_norm(u)
|| [2 5] || =
Out[13]:
5.385164807134504

However, a more efficient implementation is NumPy's norm function, available in the linalg (linear algebra) module:

In [14]:
import numpy.linalg as LA
LA.norm(u)
Out[14]:
5.385164807134504

We can plot a small diagram to confirm that the length of vector u is indeed $\approx 5.4$:

In [15]:
radius = LA.norm(u)
plt.gca().add_artist(plt.Circle((0,0), radius, color="#DDDDDD"))
plot_vector2d(u, color="red")
plt.axis([0, 8.7, 0, 6])
plt.grid()
plt.show()

Looks good!

Addition¶

In [16]:
print(" ", u)
print("+", v)
print("-"*10)
u + v
  [2 5]
+ [3 1]
----------
Out[16]:
array([5, 6])

Now let's see what vector addition looks like graphically:

In [17]:
plot_vector2d(u, color="r")
plot_vector2d(v, color="b")
plot_vector2d(v, origin=u, color="b", linestyle="dotted")
plot_vector2d(u, origin=v, color="r", linestyle="dotted")
plot_vector2d(u+v, color="g")
plt.axis([0, 9, 0, 7])
plt.text(0.7, 3, "u", color="r", fontsize=18)
plt.text(4, 3, "u", color="r", fontsize=18)
plt.text(1.8, 0.2, "v", color="b", fontsize=18)
plt.text(3.1, 5.6, "v", color="b", fontsize=18)
plt.text(2.4, 2.5, "u+v", color="g", fontsize=18)
plt.grid()
plt.show()

Vector addition is commutative: u+v = v+u. We can see in the image that following u then v leads to the same point as following v then u.

Vector addition is also associative, meaning that

u+(v+w)=(u+v)+w
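A quick numerical check of both properties, re-defining u and v with the values used above (w is an extra vector introduced only for this check):

```python
import numpy as np

u = np.array([2, 5])
v = np.array([3, 1])
w = np.array([1, 4])   # extra vector, only for this check

# Commutativity: order of addition does not matter
assert np.array_equal(u + v, v + u)

# Associativity: grouping does not matter
assert np.array_equal(u + (v + w), (u + v) + w)

print(u + (v + w))  # [ 6 10]
```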

Geometric translation

In geometry, a translation is a transformation that moves every point of a figure or a space by a fixed distance in a certain direction. A translation can be represented by a vector, with the direction and magnitude of the vector indicating the direction and distance of the translation.

In [18]:
t1 = np.array([2, 0.25])
t2 = np.array([2.5, 3.5])
t3 = np.array([1, 2])

x_coords, y_coords = zip(t1, t2, t3, t1)
plt.plot(x_coords, y_coords, "c--", x_coords, y_coords, "co")

plot_vector2d(v, t1, color="r", linestyle=":")
plot_vector2d(v, t2, color="r", linestyle=":")
plot_vector2d(v, t3, color="r", linestyle=":")

t1b = t1 + v
t2b = t2 + v
t3b = t3 + v

x_coords_b, y_coords_b = zip(t1b, t2b, t3b, t1b)
plt.plot(x_coords_b, y_coords_b, "b-", x_coords_b, y_coords_b, "bo")

plt.text(4, 4.2, "v", color="r", fontsize=18)
plt.text(3, 2.3, "v", color="r", fontsize=18)
plt.text(3.5, 0.4, "v", color="r", fontsize=18)

plt.axis([0, 6, 0, 5])
plt.grid()
plt.show()

Multiplication by a scalar¶

In linear algebra, scalar multiplication is the operation of multiplying a matrix or a vector by a scalar (a single value). The result of a scalar multiplication is a new matrix or vector that has the same direction as the original, but with its magnitude multiplied by the scalar value.

For example, if we have a vector v with a magnitude of 5, and we multiply it by a scalar of 2, the resulting vector v′ will have a magnitude of 10 and will point in the same direction as the original vector.

In [19]:
print("1.5 *", u, "=")

1.5 * u
1.5 * [2 5] =
Out[19]:
array([3. , 7.5])

Graphically, scalar multiplication of a vector can be visualized by multiplying the magnitude of the vector by the scalar value and keeping the direction unchanged.

For example, if we have a vector v with a magnitude of 5, and we multiply it by a scalar of 2, the resulting vector v′ will have a magnitude of 10 and will still point in the same direction as the original vector. Graphically, the vector v is shown scaled up by a factor of 2.

For a matrix, scalar multiplication can be visualized as stretching or shrinking the matrix along all its axes by the same factor, i.e. the matrix being scaled up or down by a factor of k.

For example, let's scale up by a factor of k = 2.5:

In [20]:
k = 2.5
t1c = k * t1
t2c = k * t2
t3c = k * t3

plt.plot(x_coords, y_coords, "c--", x_coords, y_coords, "co")

plot_vector2d(t1, color="r")
plot_vector2d(t2, color="r")
plot_vector2d(t3, color="r")

x_coords_c, y_coords_c = zip(t1c, t2c, t3c, t1c)
plt.plot(x_coords_c, y_coords_c, "b-", x_coords_c, y_coords_c, "bo")

plot_vector2d(k * t1, color="b", linestyle=":")
plot_vector2d(k * t2, color="b", linestyle=":")
plot_vector2d(k * t3, color="b", linestyle=":")

plt.axis([0, 9, 0, 9])
plt.grid()
plt.show()

As you might guess, dividing a vector by a scalar is equivalent to multiplying by the inverse of that scalar:

$$ \dfrac{\textbf{u}}{\lambda} = \dfrac{1}{\lambda} \times \textbf{u} $$

Scalar multiplication is commutative: $\lambda \times \textbf{u} = \textbf{u} \times \lambda$

It is also associative: $ \lambda_1 \times (\lambda_2 \times \textbf{u}) = (\lambda_1 \times \lambda_2) \times \textbf{u}$

Finally, it is distributive over addition of vectors: $ \lambda \times (\textbf{u} + \textbf{v}) = \lambda \times \textbf{u} + \lambda \times \textbf{v}$

Zero, unit and normalized vectors¶

  • A zero-vector is a vector full of 0s.
  • A unit vector is a vector with a norm equal to 1. The normalized vector of a non-null vector u, noted $\hat{\textbf{u}}$, is the unit vector that points in the same direction as u. It is equal to: $ \hat{\textbf{u}} = \dfrac{\textbf{u}}{\left \Vert \textbf{u} \right \|}$
In [21]:
plt.gca().add_artist(plt.Circle((0,0),1,color='c'))
plt.plot(0, 0, "ko")
plot_vector2d(v / LA.norm(v), color="k")
plot_vector2d(v, color="b", linestyle=":")
plt.text(0.3, 0.3, "$\hat{v}$", color="k", fontsize=18)
plt.text(1.5, 0.7, "$v$", color="b", fontsize=18)
plt.axis([-1.5, 5.5, -1.5, 3.5])
plt.grid()
plt.show()

Dot product¶

Definition

The dot product (also known as the scalar product or inner product) is a binary operation that takes two vectors and returns a scalar value. It is a measure of the similarity between the two vectors, based on the angle between them.

The formula for the dot product of two vectors u and v is:

$$ \textbf{u} \cdot \textbf{v} = \left \Vert \textbf{u} \right \| \times \left \Vert \textbf{v} \right \| \times cos(\theta)$$

Where ||u|| and ||v|| are the magnitudes of the vectors u and v, and θ is the angle between them.

Alternatively, the dot product can be represented as the sum of the product of the corresponding components of the two vectors:

$\textbf{u} \cdot \textbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n$

The dot product has several useful properties, such as:

  • It is commutative: u·v = v·u
  • It is distributive over vector addition: $u·(v+w) = u·v + u·w$
  • It can be used to find the angle between two vectors: $cos(θ) = u·v / (||u|| ||v||)$

The dot product is particularly useful in physics, engineering, and computer graphics, as it can be used to find the projection of one vector onto another, and in calculating the distance between two vectors.

In [22]:
def dot_product(v1, v2):
    return sum(v1i * v2i for v1i, v2i in zip(v1, v2))

dot_product(u, v)
Out[22]:
11

A more efficient way is provided by NumPy's dot function:

In [23]:
np.dot(u,v)
Out[23]:
11

Equivalently, you can use the dot method of ndarrays:

In [24]:
u.dot(v)
Out[24]:
11

Caution: the * operator will perform an elementwise multiplication, NOT a dot product:

In [25]:
print("  ",u)
print("* ",v, "(NOT a dot product)")
print("-"*10)

u * v
   [2 5]
*  [3 1] (NOT a dot product)
----------
Out[25]:
array([6, 5])

Main properties¶

The dot product (also known as the scalar product or inner product) has several useful properties that make it a powerful tool in many mathematical and applied fields. Some of the main properties are:

Commutativity: The dot product is commutative, meaning that the order of the vectors does not matter. That is, u·v = v·u.

  • Distributivity over vector addition: The dot product is distributive over vector addition, meaning that it can be split up and distributed over the vectors being added. That is, u·(v+w) = u·v + u·w

  • Linearity: The dot product is a linear operation, meaning that it satisfies the properties of additivity and homogeneity. That is, u·(v+w) = u·v + u·w and u·(cv) = cu·v, where c is a scalar.

  • Orthogonality: Two vectors are orthogonal if and only if their dot product is equal to zero. That is, u·v = 0 if and only if u and v are orthogonal.

  • Angle between two vectors: The dot product can be used to find the angle between two vectors. The formula is cos(θ) = u·v / (||u|| ||v||), where θ is the angle between two vectors u and v.

Projection: The dot product can be used to find the projection of one vector onto another. The formula is $$\textbf{proj}_{\textbf{v}}{\textbf{u}} = \dfrac{\textbf{u} \cdot \textbf{v}}{\left \Vert \textbf{v} \right \| ^2} \times \textbf{v}$$

Distance between two vectors: The dot product can be used to calculate the distance between two vectors. The formula is $$d(\textbf{u},\textbf{v}) = \left \Vert \textbf{u}-\textbf{v} \right \| = \sqrt{\left \Vert \textbf{u} \right \|^2 + \left \Vert \textbf{v} \right \|^2 - 2\,\textbf{u} \cdot \textbf{v}}$$
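We can verify the distance formula numerically, re-using the vectors u and v from earlier (a minimal sketch):

```python
import numpy as np
import numpy.linalg as LA

u = np.array([2, 5])
v = np.array([3, 1])

# Distance computed directly as the norm of the difference
d_direct = LA.norm(u - v)

# Distance computed via the dot-product identity
d_formula = np.sqrt(u.dot(u) + v.dot(v) - 2 * u.dot(v))

print(d_direct, d_formula)   # both equal sqrt(17) ~ 4.1231
```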

Calculating the angle between vectors¶

$$ \theta = \arccos{\left ( \dfrac{\textbf{u} \cdot \textbf{v}}{\left \Vert \textbf{u} \right \| \times \left \Vert \textbf{v} \right \|} \right ) }$$

Note that if $$ \textbf{u} \cdot \textbf{v} = 0$$ (for non-zero u and v) it follows that $$ \theta = \dfrac{\pi}{2}$$ Let's use this formula to calculate the angle between u and v (in radians):

In [26]:
def vector_angle(u, v):
    cos_theta = u.dot(v) / LA.norm(u) / LA.norm(v)
    return np.arccos(np.clip(cos_theta, -1, 1))

theta = vector_angle(u, v)
print("Angle =", theta, "radians")
print("      =", theta * 180 / np.pi, "degrees")
Angle = 0.8685393952858895 radians
      = 49.76364169072618 degrees

These are some of the main properties of the dot product, and it is widely used in fields such as physics, engineering, and computer graphics.

The cosine of an angle θ (often represented as cos(θ)) is a value between -1 and 1 that describes the relationship between the angle and the length of the adjacent side of a right triangle (the side adjacent to the angle) divided by the length of the hypotenuse (the longest side).

Projecting a point onto an axis¶

Given by this formula $$\textbf{proj}_{\textbf{u}}{\textbf{v}} = \dfrac{\textbf{u} \cdot \textbf{v}}{\left \Vert \textbf{u} \right \| ^2} \times \textbf{u}$$

And equivalent to : $$ \textbf{proj}_{\textbf{u}}{\textbf{v}} = (\textbf{v} \cdot \hat{\textbf{u}}) \times \hat{\textbf{u}}$$

In [27]:
u_normalized = u / LA.norm(u)
proj = v.dot(u_normalized) * u_normalized

plot_vector2d(u, color="r")
plot_vector2d(v, color="b")

plot_vector2d(proj, color="k", linestyle=":")
plt.plot(proj[0], proj[1], "ko")

plt.plot([proj[0], v[0]], [proj[1], v[1]], "b:")

plt.text(1, 2, "$proj_u v$", color="k", fontsize=18)
plt.text(1.8, 0.2, "$v$", color="b", fontsize=18)
plt.text(0.8, 3, "$u$", color="r", fontsize=18)

plt.axis([0, 8, 0, 5.5])
plt.grid()
plt.show()

Matrices¶

The dot product is not defined for matrices. The dot product is a binary operation that is defined for two vectors and it returns a scalar value. Matrices are not vectors, they are arrays of numbers arranged in a specific format (rows and columns) and they are used to represent linear transformations.

Matrix multiplication, on the other hand, is a binary operation defined for two matrices, and it returns a new matrix. Each entry of the resulting matrix is the dot product of a row of the first matrix with a column of the second matrix. The resulting matrix has the same number of rows as the first matrix and the same number of columns as the second matrix.

For example; \begin{bmatrix} 10 & 20 & 30 \\ 40 & 50 & 60 \end{bmatrix}

A 2D vector is a vector that has two components, commonly represented by (x, y) or (x1, x2), and it lives in a 2-dimensional space. It can be represented graphically as a point in a 2D plane, with the x-coordinate indicating the horizontal position, and the y-coordinate indicating the vertical position.

A 3D vector is a vector that has three components, commonly represented by (x, y, z) or (x1, x2, x3), and it lives in a 3-dimensional space. It can be represented graphically as a point in a 3D space, with the x, y, and z coordinates indicating the position along the three axes.

Matrices in python¶

In [28]:
[
    [10, 20, 30],
    [40, 50, 60]
]
Out[28]:
[[10, 20, 30], [40, 50, 60]]

A more efficient way is to use a NumPy array, which supports many matrix operations:

In [29]:
A = np.array([
    [10,20,30],
    [40,50,60]
])
A
Out[29]:
array([[10, 20, 30],
       [40, 50, 60]])

By convention, matrices generally have uppercase names, such as $A$. We use NumPy arrays (type ndarray) to represent matrices.

Size¶

The size of a vector, also known as its magnitude or norm, is a scalar value that represents the length of the vector. The size of a vector is important in many mathematical and applied fields, as it is used to measure the distance between two vectors, the angle between two vectors, and the projection of one vector onto another.

In the case of 2D vectors, the size of a vector can be calculated using the Pythagorean theorem:

$$ \left \Vert (x, y) \right \| = \sqrt{x^2 + y^2}$$

Where x and y are the components of the vector.

For 3D vectors, the size of a vector can be calculated using the same principle:

$$ \left \Vert (x, y, z) \right \| = \sqrt{x^2 + y^2 + z^2} $$

It is also important to note that the size of a vector is always non-negative, and it is zero only if the vector is the zero vector.
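These formulas can be checked with NumPy's norm function (a small sketch; the specific vectors are just examples):

```python
import numpy as np

# 2D: the Pythagorean theorem, sqrt(3^2 + 4^2)
print(np.linalg.norm([3.0, 4.0]))       # 5.0

# 3D: same principle with three components, sqrt(1 + 4 + 4)
print(np.linalg.norm([1.0, 2.0, 2.0]))  # 3.0

# The zero vector is the only vector with norm 0
print(np.linalg.norm([0.0, 0.0]))       # 0.0
```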

In [30]:
A.shape
Out[30]:
(2, 3)

Caution: the size attribute represents the number of elements in the ndarray, not the matrix's size:

Element indexing¶

In linear algebra, element indexing is the process of accessing individual elements of a vector or a matrix using an index or a set of indices. The elements of a vector are often indexed by a single integer, while the elements of a matrix are indexed by a pair of integers, one for the row and one for the column.

In mathematical notation, the elements of a vector are usually indexed from 1 to n, where n is the dimension of the vector. For example, the first element of a 3D vector is v1, the second element is v2, and the third element is v3.

Similarly, the elements of an m × n matrix are indexed from 1 to m for rows and from 1 to n for columns. For example, the element at the first row and first column of a 2 × 3 matrix is a11, the element at the second row and first column is a21, and the element at the second row and third column is a23. (Remember that NumPy indexing starts at 0.)

Another example: $$ X = (x_{i,j})_{1 \le i \le m,\, 1 \le j \le n} $$ this means that $X$ is equal to:

$$ X = \begin{bmatrix} x_{1,1} & x_{1,2} & x_{1,3} & \cdots & x_{1,n}\\ x_{2,1} & x_{2,2} & x_{2,3} & \cdots & x_{2,n}\\ x_{3,1} & x_{3,2} & x_{3,3} & \cdots & x_{3,n}\\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{m,1} & x_{m,2} & x_{m,3} & \cdots & x_{m,n}\\ \end{bmatrix}$$
In [31]:
A[1,2]  # 2nd row, 3rd column
Out[31]:
60

Square, triangular, diagonal and identity matrices¶

A square matrix is a matrix that has the same number of rows and columns, for example a $3 \times 3$ \begin{bmatrix} 4 & 9 & 2 \\ 3 & 5 & 7 \\ 8 & 1 & 6 \end{bmatrix}

An upper triangular matrix is a special kind of square matrix where all the elements below the main diagonal (top-left to bottom-right) are zero, for example:

\begin{bmatrix} 4 & 9 & 2 \\ 0 & 5 & 7 \\ 0 & 0 & 6 \end{bmatrix}

Similarly, a lower triangular matrix is a square matrix where all elements above the main diagonal are zero, for example:

\begin{bmatrix} 4 & 0 & 0 \\ 3 & 5 & 0 \\ 8 & 1 & 6 \end{bmatrix}

A triangular matrix is one that is either lower triangular or upper triangular.
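As a side note, NumPy can extract the upper or lower triangular part of an existing matrix with the triu and tril functions (a small illustrative sketch):

```python
import numpy as np

M = np.array([[4, 9, 2],
              [3, 5, 7],
              [8, 1, 6]])

# Upper triangular part: zeros below the main diagonal
print(np.triu(M))

# Lower triangular part: zeros above the main diagonal
print(np.tril(M))
```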

A matrix that is both upper and lower triangular is called a diagonal matrix, for example:

\begin{bmatrix} 4 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 6 \end{bmatrix}

We can construct a diagonal matrix with NumPy's diag function:

In [32]:
np.diag([4, 5, 6])
Out[32]:
array([[4, 0, 0],
       [0, 5, 0],
       [0, 0, 6]])

If you pass a matrix to the diag function, it will happily extract the diagonal values:

In [33]:
D = np.array([
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ])
np.diag(D)
Out[33]:
array([1, 5, 9])

Finally, the identity matrix of size $n$, noted $I_n$, is the $n \times n$ diagonal matrix with 1s on the main diagonal, for example $I_3$:

\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

NumPy's eye function returns the identity matrix of the desired size:

In [34]:
np.eye(3)
Out[34]:
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

The identity matrix is often noted simply $I$ (instead of $I_n$) when its size is clear from the context. It is called the identity matrix because multiplying a matrix by it leaves that matrix unchanged, as we will see below.
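A quick numerical check of this property, re-using the values of the matrix A defined above:

```python
import numpy as np

A = np.array([[10, 20, 30],
              [40, 50, 60]])

# A is 2x3, so it takes I_3 on the right and I_2 on the left;
# in both cases the product equals A
print(np.array_equal(A.dot(np.eye(3)), A))   # True
print(np.array_equal(np.eye(2).dot(A), A))   # True
```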

Adding matrices¶

To add matrices, the matrices must have the same dimensions (i.e. the same number of rows and columns). To add two matrices A and B, you add the corresponding entries together. For example, if A = [a11, a12, a13] and B = [b11, b12, b13], then A + B = [a11 + b11, a12 + b12, a13 + b13].

More generally, the sum of two $m \times n$ matrices $Q$ and $R$ is an $m \times n$ matrix $S$ where each element is the sum of the elements in the corresponding position: $S_{i,j} = Q_{i,j} + R_{i,j}$

S = \begin{bmatrix} Q_{11} + R_{11} & Q_{12} + R_{12} & Q_{13} + R_{13} & \cdots & Q_{1n} + R_{1n} \\ Q_{21} + R_{21} & Q_{22} + R_{22} & Q_{23} + R_{23} & \cdots & Q_{2n} + R_{2n} \\ Q_{31} + R_{31} & Q_{32} + R_{32} & Q_{33} + R_{33} & \cdots & Q_{3n} + R_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ Q_{m1} + R_{m1} & Q_{m2} + R_{m2} & Q_{m3} + R_{m3} & \cdots & Q_{mn} + R_{mn} \\ \end{bmatrix}

Let's create a $2 \times 3$ matrix $B$ and compute $A + B$:

In [35]:
B = np.array([[1,2,3], [4, 5, 6]])
B
Out[35]:
array([[1, 2, 3],
       [4, 5, 6]])
In [36]:
A
Out[36]:
array([[10, 20, 30],
       [40, 50, 60]])
In [37]:
A + B
Out[37]:
array([[11, 22, 33],
       [44, 55, 66]])

Matrix addition is commutative: $A + B = B + A$

In [38]:
B+A
Out[38]:
array([[11, 22, 33],
       [44, 55, 66]])

It is also associative: $A + (B + C) = (A + B) + C$

In [39]:
C = np.array([[100,200,300], [400, 500, 600]])

A + (B + C)
Out[39]:
array([[111, 222, 333],
       [444, 555, 666]])

Scalar multiplication¶

Scalar multiplication is the operation of multiplying a matrix by a scalar (a single value). To perform scalar multiplication on a matrix, you simply multiply each entry in the matrix by the scalar. For example, if $A = [a_{11}, a_{12}, a_{13}]$ and $k$ is a scalar, then $kA = [k a_{11}, k a_{12}, k a_{13}]$. It is distributive over addition and associative. In other words, $k(A+B) = kA + kB$ and $(k_1 k_2)A = k_1(k_2 A)$.

For instance, a matrix $M$ can be multiplied by a scalar $\lambda$. The result is noted $\lambda M$:

\lambda M = \begin{bmatrix} \lambda \times M_{11} & \lambda \times M_{12} & \lambda \times M_{13} & \cdots & \lambda \times M_{1n} \\ \lambda \times M_{21} & \lambda \times M_{22} & \lambda \times M_{23} & \cdots & \lambda \times M_{2n} \\ \lambda \times M_{31} & \lambda \times M_{32} & \lambda \times M_{33} & \cdots & \lambda \times M_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \lambda \times M_{m1} & \lambda \times M_{m2} & \lambda \times M_{m3} & \cdots & \lambda \times M_{mn} \\ \end{bmatrix}

A more succinct way of writing this is: $(\lambda M)_{i,j} = \lambda (M)_{i,j}$

In NumPy, simply use the $*$ operator to multiply a matrix by a scalar. For example:

In [40]:
2 * A
Out[40]:
array([[ 20,  40,  60],
       [ 80, 100, 120]])

Scalar multiplication is also defined on the right-hand side, and gives the same result: $M \lambda = \lambda M$. For instance:

In [41]:
A * 2
Out[41]:
array([[ 20,  40,  60],
       [ 80, 100, 120]])

Scalar multiplication is commutative.

It is also associative, meaning: $\alpha (\beta M) = (\alpha \times \beta) M$, where $\alpha$ and $\beta$ are scalars.

In [42]:
2 * (3 * A)
Out[42]:
array([[ 60, 120, 180],
       [240, 300, 360]])
In [43]:
(2 * 3) * A
Out[43]:
array([[ 60, 120, 180],
       [240, 300, 360]])

Finally, it is distributive over addition of matrices, meaning that $\lambda (Q + R) = \lambda Q + \lambda R$

In [44]:
2 * (A + B)
Out[44]:
array([[ 22,  44,  66],
       [ 88, 110, 132]])
In [45]:
2 * A + 2 * B
Out[45]:
array([[ 22,  44,  66],
       [ 88, 110, 132]])

Matrix multiplication¶

Matrix multiplication, also known as the matrix product, is a binary operation that takes a pair of matrices and produces another matrix. It is defined only if the number of columns of the first matrix equals the number of rows of the second matrix. To multiply a matrix $A$ (of size $m \times n$) by a matrix $B$ (of size $n \times p$), the entry in row $i$ and column $j$ of the resulting matrix $C$ is the dot product of the $i$-th row of $A$ and the $j$-th column of $B$: $C_{i,j} = \sum_{k=1}^n A_{i,k} B_{k,j}$. Matrix multiplication is not commutative, but it is associative, i.e. $A(BC) = (AB)C$, and distributive over addition. It has many applications in linear algebra, computer graphics, physics, statistics, and many other fields.

$P_{i,j} = \sum_{k=1}^n{Q_{i,k} \times R_{k,j}}$

The element position $i,j$ in the resulting matrix is the sum of the products of elements in row $i$ of matrix $Q$ by the elements in column $j$ of matrix $R$

P = \begin{bmatrix} Q_{11} R_{11} + Q_{12} R_{21} + \cdots + Q_{1n} R_{n1} & Q_{11} R_{12} + Q_{12} R_{22} + \cdots + Q_{1n} R_{n2} & \cdots & Q_{11} R_{1q} + Q_{12} R_{2q} + \cdots + Q_{1n} R_{nq} \\ Q_{21} R_{11} + Q_{22} R_{21} + \cdots + Q_{2n} R_{n1} & Q_{21} R_{12} + Q_{22} R_{22} + \cdots + Q_{2n} R_{n2} & \cdots & Q_{21} R_{1q} + Q_{22} R_{2q} + \cdots + Q_{2n} R_{nq} \\ \vdots & \vdots & \ddots & \vdots \\ Q_{m1} R_{11} + Q_{m2} R_{21} + \cdots + Q_{mn} R_{n1} & Q_{m1} R_{12} + Q_{m2} R_{22} + \cdots + Q_{mn} R_{n2} & \cdots & Q_{m1} R_{1q} + Q_{m2} R_{2q} + \cdots + Q_{mn} R_{nq} \end{bmatrix}

We may notice that each element $P_{i,j}$ is the dot product of the row vector $Q_{i,*}$ and the column vector $R_{*,j}$:

$P_{i,j} = Q_{i,*} \cdot R_{*,j}$

So we can rewrite $P$ more concisely as:

P = \begin{bmatrix} Q_{1,*} \cdot R_{*,1} & Q_{1,*} \cdot R_{*,2} & \cdots & Q_{1,*} \cdot R_{*,q} \\ Q_{2,*} \cdot R_{*,1} & Q_{2,*} \cdot R_{*,2} & \cdots & Q_{2,*} \cdot R_{*,q} \\ \vdots & \vdots & \ddots & \vdots \\ Q_{m,*} \cdot R_{*,1} & Q_{m,*} \cdot R_{*,2} & \cdots & Q_{m,*} \cdot R_{*,q} \end{bmatrix}
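This dot-product view is easy to verify numerically; here is a small sketch with illustrative matrices Q and R (the names are just for this example):

```python
import numpy as np

# Small example matrices: Q is 2x3, R is 3x2
Q = np.array([[1, 2, 3],
              [4, 5, 6]])
R = np.array([[7, 8],
              [9, 10],
              [11, 12]])

P = Q.dot(R)

# Each entry P[i, j] equals the dot product of row i of Q and column j of R
for i in range(Q.shape[0]):
    for j in range(R.shape[1]):
        assert P[i, j] == Q[i, :].dot(R[:, j])

print(P)
```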

Let's multiply two matrices in NumPy, using ndarray's dot method:

$E = AD =$ \begin{bmatrix} 10 & 20 & 30 \\ 40 & 50 & 60 \end{bmatrix} \begin{bmatrix} 2 & 3 & 5 & 7 \\ 11 & 13 & 17 & 19 \\ 23 & 29 & 31 & 37 \end{bmatrix} = \begin{bmatrix} 930 & 1160 & 1320 & 1560 \\ 2010 & 2510 & 2910 & 3450 \end{bmatrix}

In [46]:
D = np.array([
        [ 2,  3,  5,  7],
        [11, 13, 17, 19],
        [23, 29, 31, 37]
    ])
E = A.dot(D)
E
Out[46]:
array([[ 930, 1160, 1320, 1560],
       [2010, 2510, 2910, 3450]])
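Note that we could not have computed D.dot(A) here, since D has 4 columns but A has only 2 rows. Even when both products are defined, they generally differ; a quick sketch with small square matrices (illustrative values) checks non-commutativity and associativity:

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])
N = np.array([[0, 1],
              [1, 0]])

# Not commutative: M.dot(N) swaps the columns of M,
# while N.dot(M) swaps its rows
print(M.dot(N))  # [[2, 1], [4, 3]]
print(N.dot(M))  # [[3, 4], [1, 2]]

# Associative: (MN)M equals M(NM)
print(np.array_equal(M.dot(N).dot(M), M.dot(N.dot(M))))  # True
```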

Converting 1D arrays to 2D arrays in NumPy¶

In NumPy, a 1D array can be converted to a 2D array using the reshape() method. For example, to convert a 1D array of shape (n,) to a 2D array of shape (m, n), where the product of m and n equals the number of elements in the original array, you can use the following code:

Create a 1D array

In [47]:
import numpy as np
arr_1d = np.array([1, 2, 3, 4, 5])

Convert 1D array to 2D array of shape (1, n)

In [48]:
arr_2d = arr_1d.reshape(1, -1)
print(arr_2d)
[[1 2 3 4 5]]

Alternatively, you can use the newaxis keyword to convert a 1D array to a 2D array:

Create a 1D array

In [49]:
arr_1d = np.array([1, 2, 3, 4, 5])

Convert 1D array to 2D array of shape (1, n)

In [50]:
arr_2d = arr_1d[np.newaxis, :]
print(arr_2d)
[[1 2 3 4 5]]
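Placing np.newaxis after the colon instead yields a column vector of shape (n, 1); a short sketch using the same arr_1d:

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])

# np.newaxis in the second position produces a (5, 1) column vector
col = arr_1d[:, np.newaxis]
print(col.shape)  # (5, 1)
print(col)
```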

The reshape() method can also be used to convert a 1D array to a 2D array of any shape (m, n), where m and n are integers, as long as the total number of elements in the original 1D array is equal to the total number of elements in the desired 2D array.
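For example, a 1D array with six elements can be reshaped to (2, 3) or (3, 2), while an incompatible shape raises a ValueError:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])  # 6 elements

# Any shape whose dimensions multiply to 6 works
print(arr.reshape(2, 3))  # [[1 2 3], [4 5 6]]
print(arr.reshape(3, 2))  # [[1 2], [3 4], [5 6]]

# 4 * 2 = 8 elements does not match, so NumPy raises an error
try:
    arr.reshape(4, 2)
except ValueError:
    print("incompatible shape")
```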

Plotting a matrix¶

In NumPy, a matrix can be plotted using the matplotlib library. Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy.

Let's create a $2 \times 4$ matrix $P$ and plot it:

In [51]:
P = np.array([
        [3.0, 4.0, 1.0, 4.6],
        [0.2, 3.5, 2.0, 0.5]
    ])
x_coords_P, y_coords_P = P
plt.scatter(x_coords_P, y_coords_P)
plt.axis([0, 5, 0, 4])
plt.show()

Since the vectors are ordered, you can see the matrix as a path and represent it with connected dots:

In [52]:
plt.plot(x_coords_P, y_coords_P, "bo")
plt.plot(x_coords_P, y_coords_P, "b--")
plt.axis([0, 5, 0, 4])
plt.grid()
plt.show()

Another efficient way is to use the Polygon class. Note that it expects an $n \times 2$ NumPy array, not a $2 \times n$ array, so we must give it $P^T$:

In [53]:
from matplotlib.patches import Polygon
plt.gca().add_artist(Polygon(P.T))
plt.axis([0, 5, 0, 4])
plt.grid()
plt.show()

Geometric applications of matrix operations¶

In geometric applications, matrix operations can be used to represent and perform various transformations on geometric objects such as points, vectors, and shapes. Some common examples of transformations that can be represented by matrix operations include:

Translation: A translation can be represented by a matrix that shifts a geometric object by a specified amount in the x and y direction. For example, the matrix $[[1,0,a],[0,1,b],[0,0,1]]$ can be used to translate a point (x, y) by a units in the x-direction and b units in the y-direction.

Scaling: A scaling can be represented by a matrix that increases or decreases the size of a geometric object by a specified factor in the x and y direction. For example, the matrix $[[a,0,0],[0,b,0],[0,0,1]]$ can be used to scale a point (x, y) by a factor of a in the x-direction and a factor of b in the y-direction.

Rotation: A rotation can be represented by a matrix that rotates a geometric object by a specified angle around the origin. For example, the matrix $[[\cos\theta, -\sin\theta, 0], [\sin\theta, \cos\theta, 0], [0, 0, 1]]$ can be used to rotate a point (x, y) around the origin by an angle of $\theta$.

Shear: A shear can be represented by a matrix that skews a geometric object by a specified amount in the x and y direction. For example, the matrix $[[1,k,0],[h,1,0],[0,0,1]]$ maps a point (x, y) to (x + ky, hx + y), shearing by a factor of k in the x-direction and a factor of h in the y-direction.

These are some examples of how matrix operations can be used in geometric applications, but it's worth noting that there are many other types of transformations that can be represented by matrix operations as well as other applications such as linear algebra, computer graphics, and physics simulations.
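The homogeneous-coordinate matrices above can be applied directly in NumPy. A small sketch (with illustrative values: a 90° rotation, then a translation by a = 2, b = 3) applied to the point (1, 0):

```python
import numpy as np

theta = np.pi / 2  # rotate 90 degrees counterclockwise

# Rotation matrix in homogeneous coordinates
rotation = np.array([
    [np.cos(theta), -np.sin(theta), 0],
    [np.sin(theta),  np.cos(theta), 0],
    [0,              0,             1]])

# Translation by a = 2 in x and b = 3 in y
translation = np.array([
    [1, 0, 2],
    [0, 1, 3],
    [0, 0, 1]])

point = np.array([1, 0, 1])  # (x, y) = (1, 0) in homogeneous form

rotated = rotation.dot(point)      # (1, 0) rotates to approximately (0, 1)
moved = translation.dot(rotated)   # then shifts to approximately (2, 4)
print(np.round(moved[:2], 6))
```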

Scalar multiplication¶

Scalar multiplication is the operation of multiplying a scalar value by a matrix or a vector. In the case of a matrix, each element in the matrix is multiplied by the scalar value. In the case of a vector, each element in the vector is multiplied by the scalar value.

In mathematical notation, scalar multiplication is represented by placing the scalar value before the matrix or vector. For example, if A is a matrix and c is a scalar value, then the scalar multiplication of A and c is represented as cA. Similarly, if x is a vector and c is a scalar value, then the scalar multiplication of x and c is represented as cx.
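A quick numeric sketch of both cases, cA for a matrix and cx for a vector (the values are just for illustration):

```python
import numpy as np

c = 3
A = np.array([[1, 2],
              [3, 4]])
x = np.array([1, -1])

# Every element is multiplied by the scalar c
print(c * A)  # [[3, 6], [9, 12]]
print(c * x)  # [3, -3]
```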

For instance, let's rescale the polygon $P$ by a factor of 60%:

In [54]:
def plot_transformation(P_before, P_after, text_before, text_after, axis = [0, 5, 0, 4], arrows=False):
    if arrows:
        for vector_before, vector_after in zip(P_before.T, P_after.T):
            plot_vector2d(vector_before, color="blue", linestyle="--")
            plot_vector2d(vector_after, color="red", linestyle="-")
    plt.gca().add_artist(Polygon(P_before.T, alpha=0.2))
    plt.gca().add_artist(Polygon(P_after.T, alpha=0.3, color="r"))
    plt.text(P_before[0].mean(), P_before[1].mean(), text_before, fontsize=18, color="blue")
    plt.text(P_after[0].mean(), P_after[1].mean(), text_after, fontsize=18, color="red")
    plt.axis(axis)
    plt.grid()

P_rescaled = 0.60 * P
plot_transformation(P, P_rescaled, "$P$", "$0.6 P$", arrows=True)
plt.show()

To be continued¶

Study reference: https://souravsengupta.com/cds2016/lectures/Savov_Notes.pdf

Ahmet Mahmut Gökkaya
